Goto

Collaborating Authors

 sufficient condition


Price of Quality: Sufficient Conditions for Sparse Recovery using Mixed-Quality Data

arXiv.org Machine Learning

We study sparse recovery when observations come from mixed-quality sources: a small collection of high-quality measurements with small noise variance and a larger collection of lower-quality measurements with higher variance. For this heterogeneous-noise setting, we establish sample-size conditions for information-theoretic and algorithmic recovery. On the information-theoretic side, we show that it is sufficient for $(n_1, n_2)$ to satisfy a linear trade-off defining the Price of Quality: the number of low-quality samples needed to replace one high-quality sample. In the agnostic setting, where the decoder is completely agnostic to the quality of the data, it is uniformly bounded, and in particular one high-quality sample is never worth more than two low-quality samples for this sufficient condition to hold. In the informed setting, where the decoder is informed of per-sample variances, the price of quality can grow arbitrarily large. On the algorithmic side, we analyze the LASSO in the agnostic setting and show that the recovery threshold matches the homogeneous-noise case and only depends on the average noise level, revealing a striking robustness of computational recovery to data heterogeneity. Together, these results give the first conditions for sparse recovery with mixed-quality data and expose a fundamental difference between how the information-theoretic and algorithmic thresholds adapt to changes in data quality.





The optimal betting wealth growth rate

arXiv.org Machine Learning

This paper characterizes the best possible rate of growth of wealth in a Kelly betting game when repeatedly betting against a general i.i.d. null hypothesis $\mathscr{P}$, but the data are drawn i.i.d from an arbitrary alternative $Q$. We prove that it equals $\lim_{n \to \infty}n^{-1}\inf_{P \in (\mathscr P)^n)^{\circ\circ}} \mathrm{KL}(Q^n,P)$, where ${\mathscr P}^n = \{P^n: P \in \mathscr{P}\}$ and $(\mathscr {P}^n)^{\circ\circ}$ is its bipolar, i.e., this rate is achievable and one cannot do better. This quantity is in general smaller than a more popular quantity in the literature, $\mathrm{KL}_{\inf}(Q,\mathscr{P}) := \inf_{P \in \mathscr P}\mathrm{KL}(Q,P)$. If $\mathrm{KL}_{\mathrm{inf}}(\cdot,\mathscr P)$ is weakly lowersemicontinuous (w.l.s.c.) at $Q$, we show that the two quantities are equal; in particular, this happens when $\mathscr P$ is weakly compact. For simple alternatives, we provide the first matching necessary and sufficient condition for when power-one sequential tests exist (without assumptions on $\mathscr P, Q$). We also derive the optimal worst-case growth rate against composite $\mathscr Q$. We emphasize that test supermartingales on reduced filtrations suffice for all i.i.d. testing problems, and more general e-processes are not required. We thus completely generalize the recent results of Larsson et al.~\cite{larsson2025numeraire} to the sequential setting.


Pareto-Optimal Learning-Augmented Algorithms for Online Conversion Problems

Neural Information Processing Systems

This paper leverages machine-learned predictions to design competitive algorithms for online conversion problems with the goal of improving the competitive ratio when predictions are accurate (i.e., consistency), while also guaranteeing a worstcase competitive ratio regardless of the prediction quality (i.e., robustness). We unify the algorithmic design of both integral and fractional conversion problems, which are also known as the 1-max-search and one-way trading problems, into a class of online threshold-based algorithms (OTA). By incorporating predictions into design of OTA, we achieve the Pareto-optimal trade-off of consistency and robustness, i.e., no online algorithm can achieve a better consistency guarantee given for a robustness guarantee. We demonstrate the performance of OTA using numerical experiments on Bitcoin conversion.



Results

Neural Information Processing Systems

For any > 0, the -covering number of the Euclidean ball Bd(R):= {x 2Rd: kxk2 R} with radius R> 0 in the Euclidean metric is upper bounded by (1+2R/)d. Let F0 F 1 ... FT be a filtration and let X1,X2,...,XT be real random variables such that Xt is Ft-measurable, E[Xt|Ft 1]=0, |Xt| balmost surely, and PT t=1 E[X2t |Ft 1] V for some fixed V> 0and b> 0. Then for any 2(0,1), we have with probability at least 1, For any linear MDP satisfying Definition 3.1, we must have that k (s,a)k2 1/ p d for all s and a, and k,hk2 1/ p d for all and h. By Definition 3.1, we know that Ph( |s,a)= h (s,a),µh()i forms a valid probability distribution, and that k R S |dµh(s)|k2 p d. This yields the first equality. Repeating this calculation h 1more times yields the final equality. Lemma A.8. Fix some h and i (s,a)| 1, and kvk2 p d. Proof. By the linear MDP structure (see Proposition 2.3 of Jin et al. (2020)), for any j, Q j (s,a)= h (s,a),w j i = h (s,a), ji+ Z We first consider the case where u = h for some h which is a valid reward satisfying Definition 3.1. Assume that the reward in our MDP is set such that for h0 6= h, h0 =0 .